... results are often inconclusive. In a review of
evaluation reports from educational projects, Horst,
Tallmadge and Wood (1975) identified common errors
or hazards in evaluations and in the use of tests.
The hazards and ways to avoid them are presented
below. Any one of them can invalidate an otherwise
sound evaluation, and should be avoided.

Hazard 1: The Use of Grade-Equivalent Scores

Grade-equivalent scores provide an insensitive
and, in some instances, a systematically
distorted assessment of cognitive growth. The
concept of a grade-equivalent score is misleading.
For example, a grade-equivalent score of five
attained by a third grader on a math test does not
mean that he knows fifth-grade math. Possibly he
can do third-grade math as well as the average
fifth grader, but it is likely that no fifth-grade
students have ever taken the third-grade level
of the test.

The use of grade equivalents for evaluation
purposes creates a second problem in that they
do not form an equal-interval scale, and should
never be averaged. Finally, grade equivalents
are constructed based on the assumption that
growth occurs at the same rate throughout the
school year. Research has shown, however, that
learning typically does not follow this regular
pattern and, whenever this is the case, gains
measured in grade equivalents will be artificially
inflated or reduced. For a complete discussion
of their problems, see Technical Paper
No. 1 entitled What's Bad about Grade-Equivalent
Scores.



Hazard 2: The Use of Inappropriate Statistical Adjustments
with Nonequivalent Control Groups

There are several statistical procedures that
are widely used in an attempt to compensate for
initial differences between treatment and control
groups. Some are legitimate while others are not.
Making between-group comparisons using either
"raw" gain scores or "residual" gain scores falls
into the latter category. Both procedures should
be scrupulously avoided.

A raw gain score is simply the difference between
a pre- and a posttest score and reflects the
gain made between testings. It is argued that,
although two groups may have been somewhat different
in terms of initial achievement levels, their
expected gains would be roughly comparable after
the same educational treatment. This would be
true, however, only when each group's posttest
standard deviation is the same as its pretest
standard deviation. Where the posttest standard
deviations are larger than those of pretest scores,
a raw gain score analysis will systematically
underestimate treatment effects. Conversely, the
procedure will systematically overestimate treatment
effects where the standard deviations of
pretest scores exceed those of posttest scores.
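
The direction of this bias can be illustrated with a small worked example. The sketch below assumes, purely for illustration, that without any treatment a group keeps the same relative standing (z-score) from pretest to posttest, and that the treatment group starts below the control group. The function name and every number are hypothetical.

```python
def expected_posttest(pre, pre_mean, pre_sd, post_mean, post_sd):
    """Posttest score expected if relative standing (z-score) is unchanged.

    Assumption for illustration only: with no treatment, a group stays
    at the same z-score from pretest to posttest.
    """
    z = (pre - pre_mean) / pre_sd
    return post_mean + z * post_sd


# Hypothetical population: posttest SD (15) is larger than pretest SD (10).
PRE_MEAN, PRE_SD, POST_MEAN, POST_SD = 50.0, 10.0, 60.0, 15.0

treatment_pre = 40.0  # treatment group starts one SD below the mean
control_pre = 50.0    # control group starts at the mean

# Raw gains each group would show with NO treatment effect at all.
treatment_gain = (
    expected_posttest(treatment_pre, PRE_MEAN, PRE_SD, POST_MEAN, POST_SD)
    - treatment_pre
)
control_gain = (
    expected_posttest(control_pre, PRE_MEAN, PRE_SD, POST_MEAN, POST_SD)
    - control_pre
)

# A negative value means the raw-gain comparison starts out biased
# against the treatment group, so a real effect would be underestimated.
raw_gain_bias = treatment_gain - control_gain
print(treatment_gain, control_gain, raw_gain_bias)
```

Under these assumptions the lower-scoring group shows the smaller raw gain even though neither group was treated. Swapping the two standard deviations reverses the sign of the bias.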

A residual gain score is the difference between
an actual posttest score and a posttest
score estimate derived from the combined treatment
and control group regression line. Presumably
the mean residual gain score for a group
which received an effective treatment would be
positive while that for the control group would
be negative. Also, the sum of the absolute values
of the two differences would provide an index of
the size of the treatment effect. Unfortunately,
it can be shown algebraically that a residual gain
score analysis understates the treatment effect
unless the groups' mean pretest scores are equal.
Furthermore, the amount of understatement is
directly proportional to the size of the initial
difference between groups.

There are other factors, such as how the treatment
and control groups were formed, which determine
the appropriate adjustment procedure to compensate
for the initial differences. Refer to
Technical Paper No. 12 entitled Statistical
Adjustments for Nonequivalent Control Groups for a
more complete discussion of this topic. Here it
is sufficient to point out that neither raw nor
residual gain score adjustments are adequate.

Hazard 3: The Use of Norm-Group Comparisons with 
Inappropriate Test Dates 

In norm-referenced evaluations, tests should
be administered at nearly the same time as the
test publisher tested the norm group. When control
groups are available, few evaluators would
consider testing the treatment and control groups
more than a few days apart. When norms are used
as a substitute control group, this same consideration
needs to be given to test dates.

Treatment group students should be tested
within two weeks of the midpoint of the interval
during which the normative data were collected.
Testing within six weeks of empirical normative
data points is permissible if linear interpolations
or extrapolations of the normative data are
made. Tests that provide normative data for only
one point in the year should not be used in
fall-to-spring norm-referenced evaluations.
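
The two-week and six-week windows can be checked mechanically. The following is a minimal sketch, not a publisher's procedure: the function names and the three status labels are hypothetical, and `norm_date` is assumed to stand for the midpoint of the interval in which the normative data were collected.

```python
from datetime import date


def norm_date_status(test_date, norm_date):
    """Classify a test date against an empirical norming date.

    Returns "direct" within two weeks, "interpolate" within six weeks,
    and "unsuitable" otherwise, following the windows given in the text.
    """
    gap_days = abs((test_date - norm_date).days)
    if gap_days <= 14:
        return "direct"
    if gap_days <= 42:
        return "interpolate"
    return "unsuitable"


def interpolate_norm(test_date, date_a, value_a, date_b, value_b):
    """Linearly interpolate (or extrapolate) a norm value for test_date
    from the values observed at two empirical norming dates."""
    span_days = (date_b - date_a).days
    fraction = (test_date - date_a).days / span_days
    return value_a + fraction * (value_b - value_a)


# Hypothetical dates: norms collected around October 1, testing on November 5.
print(norm_date_status(date(2000, 11, 5), date(2000, 10, 1)))
```

Interpolation needs two empirical norming points, which is one way to see why a test normed at only one point in the year cannot support a fall-to-spring comparison.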



Hazard 4: The Use of Inappropriate Levels of Tests 

If most of the pupils achieve very high or
very low test scores, the level of the test may
be inappropriate for assessing their performance.
If pupils encounter the test floor at pretest time
or the ceiling at posttest time, treatment effects
will be underestimated. Conversely, if the ceiling
is encountered on the pretest or the floor on
the posttest, gains will be overestimated. Ideally,
students should score in the middle of the
range of possible raw scores.

Test levels should be selected on the basis
of the achievement levels of the pupils, not
on the basis of their grade in school. In most
cases, the nominally recommended test level or one
level below will be suitable for testing Title I
students. See Technical Paper No. 6 entitled
Out-of-Level Testing for additional information
on this topic.

Using a test level other than that nominally
recommended for a particular grade is likely to
mean that norms tables for the tested students
are not included in the test manual. However,
it is not meaningful to assess either status or
gains by comparisons with students at a different
grade level. The status of a sixth grader should
be assessed using sixth-grade norms even if he
is tested with a fourth-grade test. Most major
test publishers, fortunately, have interlocked
their test levels by providing an expanded standard
score scale which enables the determination
of score equivalencies between adjacent test
levels. These scores make it possible to predict
from a pupil's score on one test level how he
would have scored on the next higher or lower
level, thus providing access to the in-level
norms.
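
A quick screen for floor and ceiling problems is to count how many pupils score at the extremes of the raw-score range. The sketch below is illustrative only. It assumes a multiple-choice test whose chance score is the number of items divided by the number of answer choices, and it treats 90 percent correct as "near the ceiling"; both thresholds and the function name are assumptions, not values taken from this paper.

```python
def floor_ceiling_rates(raw_scores, n_items, n_choices, ceiling_fraction=0.9):
    """Share of pupils at or below chance, and at or near the maximum score.

    Assumptions (not from the text): the chance score is
    n_items / n_choices, and "near the ceiling" means at least
    ceiling_fraction of the items answered correctly.
    """
    chance_score = n_items / n_choices
    ceiling_score = ceiling_fraction * n_items
    n_pupils = len(raw_scores)
    at_floor = sum(1 for s in raw_scores if s <= chance_score) / n_pupils
    at_ceiling = sum(1 for s in raw_scores if s >= ceiling_score) / n_pupils
    return at_floor, at_ceiling


# Hypothetical raw scores on a 40-item, four-choice test.
scores = [8, 10, 12, 20, 25, 30, 36, 38, 22, 18]
print(floor_ceiling_rates(scores, n_items=40, n_choices=4))
```

A large share at either extreme suggests the test level does not match the pupils' achievement level.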

Hazard 5: Missing Test Scores

Analyses of evaluation data should be based
only on those students with both pre- and posttest
scores. Interpretation of these data, however,
should take into account the characteristics
of the students who dropped out, graduated, or
transferred from the project. For example, if many
of the lowest-scoring students on the pretest
dropped out before posttest time, the average
posttest score would increase with respect to
pretest scores simply because of the missing
students. This increase could be misleading.
Likewise, if the high-scoring students
left the group, the mean posttest score
would be artificially deflated.

To avoid this hazard, every effort must be
made to obtain pre- and posttest scores for each
project participant, and to base comparisons on
those students for whom both scores are available.
Data from students having only pretest or only
posttest scores must be carefully examined to see
if they differ in some systematic way from the
data of students having both pre- and posttest
scores. A description of any of these differences
should be included in the evaluation report.
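
One simple way to examine the incomplete records is to compare the pretest scores of pupils who were posttested with those of pupils who were not. The sketch below assumes a hypothetical record layout of (pretest, posttest) pairs in which a missing score is stored as `None`; it is a starting point for the description called for above, not a complete attrition analysis.

```python
from statistics import mean


def attrition_summary(records):
    """Compare pretest means of pupils with and without a posttest score.

    `records` is a list of (pretest, posttest) pairs; a missing score is
    represented by None (a convention assumed for this sketch).
    """
    complete = [pre for pre, post in records
                if pre is not None and post is not None]
    pretest_only = [pre for pre, post in records
                    if pre is not None and post is None]
    return {
        "n_complete": len(complete),
        "n_pretest_only": len(pretest_only),
        "pretest_mean_complete": mean(complete) if complete else None,
        "pretest_mean_pretest_only": mean(pretest_only) if pretest_only else None,
    }


# Hypothetical data: the two lowest pretest scorers were never posttested.
records = [(30, 41), (35, 44), (20, None), (22, None), (40, 52)]
print(attrition_summary(records))
```

A marked difference between the two pretest means indicates that the pupils who left were not typical of the group, and the evaluation report should say so.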

Hazard 6: The Use of Noncomparable Treatment and
Control Groups

This hazard is closely related to Hazards 2
and 8. In conventional experimental designs,
treatment and control groups should be similar
in all educationally relevant respects before
the treatment begins. Groups which differ in
terms of pretest scores present an obvious source
of bias. Other more subtle factors such as differences
in age, sex, race, or socioeconomic
status can also exert strong biasing influences
and should be avoided. Nonvolunteers should
never be used as controls for pupils who volunteered
(or were volunteered by their parents)
for a particular instructional treatment.

The preferred approach is to assign pupils
to treatment and control groups on a random basis.
Pupils could be randomly assigned to first- and
second-semester groups. For the first half of
the year, one group could serve as the control
group for the other, but both groups would ultimately
receive equal amounts of the treatment.

In some cases, pre-existing groups will be
enough alike so that they can appropriately be
considered equivalent to random samples from a
single population. In other cases, a control
group will be known to differ systematically from
the treatment group. Where the difference is
small, the control group model may still provide
the best method of evaluating the project, and
statistical adjustments can be made to compensate
for between-group differences (see Technical
Paper No. 12, entitled Statistical Adjustments
for Nonequivalent Control Groups). Where the
differences are large, however, there is no way
in which a noncomparable control group can provide
an accurate estimate of how well the treatment
group would have done without the treatment.

Hazard 7: The Use of a Single Set of Test Scores for Both 
Selecting and Pretesting Participants 

When students are selected for participation
in a special group because they obtained relatively
high or relatively low scores on some test,
use of these scores as pretest measures invalidates
any kind of norm-referenced evaluation.
This problem stems from what is known as "statistical
regression," "regression toward the mean,"
or simply, the regression effect. For a discussion
of this topic, refer to Technical Paper No.
3 entitled The Regression Effect.

When an initially low-scoring group is retested
on the same or a comparable test, they will score
higher on the average, while an initially
high-scoring group will score lower. The result
is that low-scoring groups appear to learn more
from a special program than they actually do, while
gains in special programs for high-scoring students
may be obscured.

To avoid this hazard, students should be selected
for participation in a special treatment
based on one set of test scores and then be pretested
using an alternate form of the same test
or a different test. A perfectly legitimate
alternative is to base student selection on teacher
recommendations or classroom grades.
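
The regression effect is easy to reproduce by simulation. The sketch below is a hedged illustration, not a model of any real test: it assumes each observed score is a stable "true" score plus independent random error on each testing, and every parameter (means, standard deviations, cutoff, seed) is hypothetical. No treatment is applied at any point.

```python
import random


def simulate_regression_effect(n_students=10000, cutoff=40.0, seed=0):
    """Select low scorers on one testing and retest them with no treatment.

    Assumed model (illustration only): observed score = true score + error,
    with a fresh, independent error on each testing.
    Returns (mean selection score, mean retest score) for selected pupils.
    """
    rng = random.Random(seed)
    selection_scores = []
    retest_scores = []
    for _ in range(n_students):
        true_score = rng.gauss(50.0, 8.0)
        first_testing = true_score + rng.gauss(0.0, 6.0)
        second_testing = true_score + rng.gauss(0.0, 6.0)
        if first_testing <= cutoff:  # selected because of a low score
            selection_scores.append(first_testing)
            retest_scores.append(second_testing)
    mean_selection = sum(selection_scores) / len(selection_scores)
    mean_retest = sum(retest_scores) / len(retest_scores)
    return mean_selection, mean_retest


mean_selection, mean_retest = simulate_regression_effect()
print(mean_selection, mean_retest)
```

The selected group's retest mean moves back toward the overall mean, which would look like a gain if the selection scores were also used as the pretest.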



Hazard 8: Constructing a Matched Control Group After
the Treatment Group Has Been Selected

Finding "matches" for treatment participants
in some other group is a fundamentally unsound
practice. Unless they and the treatment pupils
are equally representative of the groups from
which they are drawn, statistical regression will
act differentially on the two groups and artificially
inflate the apparent gains of one group
with respect to the other.

In the most common situation, the group(s)
from which the matching control pupils are drawn
will be higher achieving than those from which the
treatment group pupils are selected. Consequently,
the control group pupils will be farther below
the mean of the group(s) to which they belong
than the treatment group children. On retesting
they will thus show greater statistical regression
and their posttest scores will be too high
to serve as a no-treatment expectation for the
Title I participants.



The correct procedure for establishing matched
control groups is to do the matching first and
then assign the members of each pair randomly to
treatment or the control group. That is, a large
group of students, all eligible to be in the project,
must be available. The first step is to divide
the group into matched pairs based on test
scores, background, sex, etc., so that the
two members of each pair are as similar as possible.
Then, after the matching process is complete,
some random procedure such as flipping a coin
should be used to decide which member of each pair
goes into the treatment and which into the control
group.
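
The match-first, assign-second procedure can be sketched as follows. This is a simplified illustration: it matches on a single test score only (the text also mentions background and sex), the function name and data layout are hypothetical, and a leftover pupil in an odd-sized pool is simply left unassigned.

```python
import random


def matched_pair_assignment(pupils, seed=None):
    """Pair eligible pupils on test score, then randomize within each pair.

    `pupils` is a list of (pupil_id, test_score) tuples. Matching on the
    score alone is a simplifying assumption of this sketch.
    Returns (treatment_ids, control_ids).
    """
    rng = random.Random(seed)
    ranked = sorted(pupils, key=lambda p: p[1])
    treatment_ids = []
    control_ids = []
    # Adjacent pupils in score order form a matched pair.
    for i in range(0, len(ranked) - 1, 2):
        pair = [ranked[i], ranked[i + 1]]
        rng.shuffle(pair)  # the "coin flip" for this pair
        treatment_ids.append(pair[0][0])
        control_ids.append(pair[1][0])
    return treatment_ids, control_ids


# Hypothetical pool of pupils, all eligible for the project.
pool = [("a", 31), ("b", 45), ("c", 33), ("d", 47), ("e", 52), ("f", 50)]
print(matched_pair_assignment(pool, seed=1))
```

Because both members of every pair come from the same pool and chance alone decides who is treated, statistical regression acts equally on the two groups.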

Hazard 9: The Careless Administration and Scoring of Tests

Testing must be accomplished with scrupulous
attention to detail. For most evaluation models,
the primary requirement is that treatment and control
or comparison groups be tested in exactly the
same way. Minor variations from the procedures
described by the test publisher are permissible.
In norm-referenced evaluations, treatment groups
should be tested in the same way as the students
in the norm group. This requirement means that
procedures outlined by the test publisher must
be followed precisely.

Problems arise if tests are administered or
scored in an inconsistent and careless manner.
If there are differences in the ways in which the
test takers and the norm group students are tested
or if there are differences in the procedures,
conditions, and scoring at pretest and posttest
times, then it is impossible for the resulting
data to be accurately interpreted. There are no
statistical manipulations that can compensate for
mistakes made in administering or scoring a test.

To avoid this hazard, the following steps
should be taken:

1. Test administration and scoring procedures
must be exactly the same for the treatment group
and the group used to generate the no-treatment
expectation. Testing treatment group pupils in
exactly the same way as pupils in the norming
sample means following the test publisher's
directions in every detail.

2. The procedures, conditions, and scoring methods
used during posttesting must be exactly
the same as those used during pretesting.



Hazard 10: The Use of Different Instruments for Pretesting 
and Posttesting 

In the norm-referenced design, it is not advisable
to change tests between pre- and posttesting
because there is no adequate way to compare
pretest scores on one test with posttest scores
on a completely different test. Since each test
publisher follows slightly different norming practices,
it is likely that one test's norms will be
slightly "easier" than another's. This difference
does not matter if the same test is used both pre
and post but could magnify or obscure real gains
if changes were made. While it is not essential
to use the same form and level of an achievement
test pre and post, this practice is also recommended.

Some tests have been developed so that the
lower levels are intended for use at the end of
one grade and the beginning of the next. In these
instances, to use the same form and level of test
for fall pretesting and spring posttesting, it
will be necessary either to pretest or posttest
out-of-level. In some grades where spring-to-spring
or fall-to-fall evaluations are conducted,
it may be necessary to change test levels in
order to avoid ceiling or floor effects;
unfortunately, this practice will introduce an unknown
amount of error into the measure of gain.

Hazard 11: The Use of Inappropriate Formulas to Generate 
No Treatment Expectations 

Many projects use an unrealistic theoretical
model or formula to calculate "expected" posttest
scores from IQ or other pretest scores. If students
do better than the calculated expectation,
the project is considered a success.

Many methods have been devised for calculating
performance level expectations which rest on
untenable assumptions. Neither IQ scores nor
grade-equivalent scores should be used to generate
no-treatment expectations. For example, a student
who has gained .7 years per year, on the average,
since beginning school, is presumed to continue
at the same rate unless a special program increases
his rate. Unfortunately, grade-equivalent
gains measured from fall to spring will usually
exceed this rate, even for typical Title I children,
and treatments will appear to be more effective
than they really are.

In norm-referenced models, no-treatment expectations
should be generated solely from empirical
percentile norms tables. When control groups
are used, actual posttest scores of these
groups provide the proper basis for evaluating
treatment effects. In the special regression
model, a regression line based on comparison group
data can be used to estimate the posttest scores.
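
For the special regression model, the idea of estimating no-treatment posttest scores from comparison group data can be sketched with an ordinary least-squares line. This is a minimal illustration under stated assumptions: the function name is hypothetical, all scores are made up, and a real application would also need to check that a straight line fits the comparison data before extending it to the treatment group.

```python
def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y regressed on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical comparison group (pupils who did not receive the treatment).
comparison_pre = [50, 55, 60, 65, 70]
comparison_post = [58, 62, 66, 70, 74]
slope, intercept = fit_line(comparison_pre, comparison_post)

# Hypothetical treatment group: expected scores come from the comparison line.
treatment_pre = [30, 40]
treatment_post = [47, 54]
expected_post = [intercept + slope * pre for pre in treatment_pre]

# Mean amount by which actual posttest scores exceed the expectation.
mean_excess = sum(
    actual - expected
    for actual, expected in zip(treatment_post, expected_post)
) / len(treatment_post)
print(slope, intercept, mean_excess)
```

The expectation here comes from observed scores of untreated pupils rather than from a formula based on IQ or grade-equivalent growth rates.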

Hazard 12: Mistaken Attribution of Causality

Observed gains may have resulted from the Title
I treatment, but there are always plausible alternative
explanations. The plausibility of these
alternative explanations should be carefully examined
before evaluation results are attributed to project
impact, as evaluation hazards are often the
cause of apparent gains or non-gains.

Sometimes project participants learn substantially
more than would have been expected, but the
project, per se, is not responsible. Instead, the
gains could be a result of the Hawthorne effect
(Whitehead) in which special project participants
do well simply because they are getting
special attention. The nature of the treatment
may not necessarily be important. An opposite result
may follow from a John Henry effect (Saretsky,
1972). In this case, comparison group students
work extra hard to prove that they are just
as good as project students.

Other likely causes of misleading gains are
unrecognized "treatments" which have nothing to
do with the project. Most school systems are in a
constant state of flux with multiple changes every
year. Changes in school programs, personnel,
facilities, class sizes, and community characteristics:
any or all of these factors can affect student
performance. Also, the true source of achievement
gains is sometimes improperly identified because
children are involved in more than one treatment.
Under these conditions, it is impossible to determine
causality in an unambiguous manner.

* * *

The table below indicates which hazards present
the biggest threat to validity of the Title I
evaluation models.

EVALUATION HAZARDS BY MODELS

                                 Control   Norm     Special
                                 Group     Group    Regression
                                 Model     Model    Model

 1. Grade-equivalent scores         X        XX        X
 2. Inappropriate adjustments       X
 3. Norm-referenced testing on
    inappropriate dates                      X
 4. Inappropriate test levels       X        X         X
 5. Missing test scores             X        X         X
 6. Noncomparable groups            X
 7. Selection based on pretest
    scores                                   X
 8. Post-hoc matching of groups     X
 9. Careless testing                X        X         X
10. Noncomparable pre- and
    posttests                                X
11. Inappropriate posttest
    estimates                      N/A      N/A       N/A
12. Mistaken attribution of
    causality                       X        X         X